An Exploration of Formalized Retrieval Heuristics

Authors

  • Hui Fang
  • Tao Tao
  • ChengXiang Zhai
Abstract

Empirical studies of information retrieval methods show that good retrieval performance is closely related to the use of various retrieval heuristics, such as TF-IDF weighting. Any effective retrieval formula, no matter how it was originally motivated, often boils down to an explicit or implicit implementation of these heuristics. A basic research question is thus what exactly these “necessary” heuristics are that seem to cause good retrieval performance. In this paper, we present a formal study of these retrieval heuristics. We formally define a set of basic desirable constraints that any reasonable retrieval function should satisfy, and check these constraints on a variety of representative retrieval functions. We find that none of these retrieval functions satisfies all the constraints unconditionally. Empirical results show that when a constraint is not satisfied, it often indicates non-optimality of the method, and when a constraint is satisfied only for a certain range of parameter values, the method's performance tends to be poor when the parameter is outside that range. In general, we find that the empirical performance of a retrieval formula is tightly related to how well it satisfies these constraints. Thus the proposed constraints can provide a good explanation of many empirical observations and make it possible to evaluate any existing or new retrieval formula analytically.
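The idea of checking a retrieval function against a formal constraint can be illustrated with a minimal sketch. The abstract does not state the constraints themselves, so everything here is an assumption made for illustration: the constraint (adding occurrences of a query term should never lower a document's score), the two toy scoring functions, and all function names are hypothetical and are not taken from the paper.

```python
import math

def log_tf(tf):
    """Log-scaled term frequency; zero when the term is absent."""
    return 1.0 + math.log(tf) if tf > 0 else 0.0

def smoothed_idf(df, num_docs):
    """Toy IDF that stays positive for any df <= num_docs (assumed form)."""
    return math.log((num_docs + 1) / (df + 0.5))

def rsj_style_idf(df, num_docs):
    """Toy IDF that turns negative once df exceeds num_docs / 2 (assumed form)."""
    return math.log((num_docs - df + 0.5) / (df + 0.5))

def make_scorer(idf_fn):
    """Build a TF-IDF style scorer: sum of log_tf * idf over the query terms."""
    def score(query_terms, doc_counts, doc_freq, num_docs):
        return sum(
            log_tf(doc_counts.get(t, 0)) * idf_fn(doc_freq.get(t, 0), num_docs)
            for t in query_terms
        )
    return score

def satisfies_tf_monotonicity(score_fn, term, doc_freq, num_docs, max_tf=20):
    """Hypothetical constraint: more occurrences of a query term never lower the score."""
    scores = [
        score_fn([term], {term: tf}, doc_freq, num_docs)
        for tf in range(1, max_tf + 1)
    ]
    return all(a <= b for a, b in zip(scores, scores[1:]))

tf_idf = make_scorer(smoothed_idf)
rsj = make_scorer(rsj_style_idf)

# A rare term (50 of 1000 documents) and a very common one (900 of 1000).
print(satisfies_tf_monotonicity(tf_idf, "retrieval", {"retrieval": 50}, 1000))  # True
print(satisfies_tf_monotonicity(rsj, "retrieval", {"retrieval": 50}, 1000))     # True
print(satisfies_tf_monotonicity(rsj, "the", {"the": 900}, 1000))                # False
```

In this toy setting the second scorer satisfies the assumed constraint only while the term occurs in fewer than half of the documents, which mirrors the abstract's point that some constraints hold only conditionally.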

Similar resources

An Exploration of Formalized Information Retrieval Heuristics

Empirical studies of information retrieval methods show that good retrieval performance is closely related to the use of various retrieval heuristics, such as TF-IDF weighting. Any effective retrieval formula, no matter how it is originally motivated, also often boils down to an explicit or implicit implementation of these heuristics. One basic research question is thus what are exactly these “...

Applying Heuristics to Improve A Genetic Query Optimisation Process in Information Retrieval

This work presents a genetic approach for query optimisation in information retrieval. The proposed GA is improved by heuristics in order to solve the relevance multimodality problem and adapt the genetic exploration process to the information retrieval task. Experiments with AP documents and queries issued from TREC show the effectiveness of our GA model.

Critical Systems Heuristics (CSH) to Deal with Stakeholders' Contradictory Viewpoints of Iran Performance Based Budgeting System

Objective: Performance based budgeting (PBB) is now an undeniable necessity for effective management of the country's vital resources, and it benefits all economic and social layers of society if properly implemented. Accordingly, this has encouraged many studies of PBB theories, concepts and models. This study deeply reviewed Iran’s PBB system within four basic issues, includ...

Coordinating Order Acceptance and Batch Delivery for an Integrated Supply Chain Scheduling

This paper develops Order Acceptance for an Integrated Production-Distribution Problem in which Batch Delivery is implemented. The aim of this problem is to coordinate (1) rejecting some of the orders, (2) production scheduling of the accepted orders, and (3) batch delivery, so as to maximize Total Net Profit. A Mixed Integer Programming model is proposed for the problem. In addition, a hybrid meta-heuristic...

Multiple query evaluation based on an enhanced genetic algorithm

Recent studies suggest that significant improvement in information retrieval performance can be achieved by combining multiple representations of an information need. The paper presents a genetic approach that combines the results from multiple query evaluations. The genetic algorithm aims to optimise the overall relevance estimate by exploring different directions of the document space. We inv...

Publication date: 2003